Skip to content

Conversation

@rtjd6554
Copy link
Collaborator

@rtjd6554 rtjd6554 commented Nov 27, 2025

Make sure you have checked all steps below.

Issue

  • My PR fully resolves the following issues. I've referenced an issue in the PR title, for example "Issue 1234 - My
    Feature". Note that before an issue is finished, you can still make a pull request by raising a separate issue
    for your progress.

Tests

  • My PR adds the following tests based on our test strategy OR does not need testing for this extremely good reason:
    • Existing execution of test suite
    • New tests for classes created: ByteArrayTest

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it, or I have linked to a
    separate issue for that below.
  • If I have added new Java code, I have added Javadoc that explains it following our conventions and style.
  • If I have added or removed any dependencies from the project, I have updated the NOTICES file.

@m09526
Copy link
Member

m09526 commented Nov 28, 2025

Could we replace this with a commons lang3 Pair? They have ImmutablePair and MutablePair as required.

@patchwork01
Copy link
Collaborator

Personally I'd rather avoid using a Pair-type data structure at all, in favour of something with a more descriptive name. Using a Java record lets you achieve that with very little boilerplate, something like this:

public record FieldValue<T>(String fieldName, T value) {
}

I'd keep that in the Athena code, next to where it's used, maybe nest it in the class and make it static.

@rtjd6554 rtjd6554 marked this pull request as ready for review December 2, 2025 15:34
@rtjd6554 rtjd6554 added the needs-reviewer Pull requests that need a reviewer to be assigned label Dec 2, 2025
@ca61688 ca61688 self-assigned this Dec 2, 2025
@ca61688 ca61688 removed the needs-reviewer Pull requests that need a reviewer to be assigned label Dec 2, 2025
*/
package sleeper.athena.record;

public record FieldAtDimension(int dimension, Object value) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is field AT dimension whereas other is field AS String. Is that correct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed typo

*
* @param array1 first ByteArray object
* @param array2 second ByteArray object
* @return true is equal
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true if* equal

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed typo

* Method for checking equality versus generic object declaration.
*
* @param o object to compare versus
* @return true is equal
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true if* equal

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed typo


import static org.assertj.core.api.Assertions.assertThat;

public class ByteArrayTest {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth testing equals validation for one null one not and for the contents of the arrays being equal?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

Copy link
Collaborator

@ca61688 ca61688 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few small comments otherwise looks good

@ca61688 ca61688 assigned rtjd6554 and unassigned ca61688 Dec 2, 2025
@ca61688 ca61688 assigned rtjd6554 and unassigned ca61688 Dec 3, 2025
}

/**
* Method for checking equaility.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo on equaility - should be equality

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

Copy link
Collaborator

@patchwork01 patchwork01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Facebook have the same license as this project. I'm basing this review on my own reading of the terms:

https://github.com/facebookarchive/jcommon/blob/master/LICENSE


/**
* Class to provide wrapper object for primitive byte array into an object.
* Require as hashCode and equals requied for usage of type within a HashSet.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the Oracle Javadoc style guide:

https://www.oracle.com/uk/technical-resources/articles/java/javadoc-tool.html#styleguide

A class description shouldn't start with "class", see "Class/interface/field descriptions can omit the subject and simply state the object." If we take the first sentence as it is, it could start with "A wrapper for ...".

There's a typo at "requied".

There are grammatical problems at "for primitive", "Require as", "equals required", "usage of type".

It's not technically true that by wrapping a byte array we turn it into an object. A byte array is already an object.

I don't think HashSet is relevant to this class?

I think the important part for us is it implements Comparable and equals/hashCode.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed,

disagree on the HashSet comment as otherwise we could just use the primitive here, it is directly because we use in within HashSets that we need it as a separate class.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what you mean, we don't use it in HashSets.

}

@Test
void shouldValidateEqualsMethodWithNullObjects() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test name should describe the behaviour being asserted, rather than name the method being tested.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

ByteArray valid = ByteArray.wrap(new byte[]{'a', 2, 3});

// When / Then
assertThat(ByteArray.equals(valid, null)).isFalse();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we need a version of these tests for the instance method as well?

I don't think we ever use ByteArray.equals(ByteArray, ByteArray), so it might be simpler to get rid of that and only keep the instance method?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment on ByteArray.equals(ByteArray, ByteArray).

@rtjd6554 rtjd6554 removed their assignment Dec 4, 2025
@rtjd6554 rtjd6554 added the needs-reviewer Pull requests that need a reviewer to be assigned label Dec 4, 2025
NOTICES Outdated


The file sleeper.core.util.ByteArray.java contains code that is heavily based on ByteArray from the Facebook Jcommon
collections library, licensed under the Apache License, Version 2.0.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the paragraph below we've used a relative file system path. It might be worth being consistent with that, rather than using the package name.

I think the library name is capitalised "Facebook JCommon Collections", although the only place I could find it not in all lower case is here:
https://mvnrepository.com/artifact/com.facebook.jcommon/collections

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

package sleeper.athena.record;

/**
* Storage of key pairing for field and value.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what "key pairing" means here. It's a value for a field, held as a string along with the name of the field.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

package sleeper.athena.record;

/**
* Storage of key pairing of dimension and value.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what "key pairing" means here. It's a value for a row key field, held along with the dimension of the row key, which is its index in the list of row keys in the schema. I'd include an explanation of what a dimension is here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

* Storage of key pairing for field and value.
*
* @param fieldName field name
* @param value value stored
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it reads a bit oddly to omit "the" in these phrases. I think "value stored" is a bit vague, given it's not the actual value in the format we use elsewhere.

I'd say "the field name" and "a string representation of the value".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

* Storage of key pairing of dimension and value.
*
* @param dimension dimension key
* @param value value stored
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it reads a bit oddly to omit "the" in these phrases. I don't know what "key" means in "dimension key".

I'd say "the index of the row key in the schema" and "the value".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

ByteArray valid = ByteArray.wrap(new byte[]{3, 'd', 5});

// When / Then
assertThat(valid.equals(ByteArray.wrap(new byte[]{3, 'd', 5}))).isTrue();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to duplicate the test above, shouldMatchByteArraysWhenWrappedUsingSameData.

Comment on lines +107 to +114
@Test
void shouldCorrectlySortByteArrays() {
// Given
ByteArray first = ByteArray.wrap(new byte[]{1, 2, 3});
ByteArray second = ByteArray.wrap(new byte[]{4, 5, 6, 7, 8});

// When / Then
assertThat(first).isLessThan(second);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test name suggests this is testing more than it is. We need more than this one case to test sorting.

It'd be good to make nested test classes for equality and for sorting.


// Then 2
assertThat(treeSet).containsExactly(first);
assertThat(treeSet).doesNotContain(second);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The containsExactly assertion already asserts that it doesn't contain the second byte array.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed


import static org.assertj.core.api.Assertions.assertThat;

public class ByteArrayTest {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please could you establish a consistent naming scheme for tests that do similar things?

It's a bit difficult to see the intention right now. Nested test classes could help as well.

For example, there's shouldFindByteArraysAreEqualWithSameSourceArray (my wording) and shouldMatchByteArraysWhenWrappedUsingSameData. These tests are extremely similar, but it's hard to tell because the names have quite different wording.


import static org.assertj.core.api.Assertions.assertThat;

public class ByteArrayTest {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still a number of tests that use vague names as in my previous comments (e.g. #6129 (comment)).

Several tests use "shouldValidate..." when they're not about validation.

The "shouldReturnCorrectComparison" and "shouldProvideCorrectComparison" don't explain what correct means, or which behaviour you wanted to assert.

@patchwork01 patchwork01 removed the needs-reviewer Pull requests that need a reviewer to be assigned label Dec 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remove Facebook collections library

6 participants